Statistical validation of mutual information calculations: comparison of alternative numerical algorithms.

Authors

  • C J Cellucci
  • A M Albano
  • P E Rapp
Abstract

Given two time series X and Y, their mutual information, I(X,Y) = I(Y,X), is the average number of bits of X that can be predicted by measuring Y and vice versa. In the analysis of observational data, calculation of mutual information occurs in three contexts: identification of nonlinear correlation, determination of an optimal sampling interval, particularly when embedding data, and in the investigation of causal relationships with directed mutual information. In this contribution a minimum description length argument is used to determine the optimal number of elements to use when characterizing the distributions of X and Y. However, even when using partitions of the X and Y axes indicated by minimum description length, mutual information calculations performed with a uniform partition of the XY plane can give misleading results. This motivated the construction of an algorithm for calculating mutual information that uses an adaptive partition. This algorithm also incorporates an explicit test of the statistical independence of X and Y in a calculation that returns an assessment of the corresponding null hypothesis. The previously published Fraser-Swinney algorithm for calculating mutual information includes a sophisticated procedure for local adaptive control of the partitioning process. When the Fraser-Swinney algorithm and the algorithm constructed here are compared, they give very similar numerical results (less than 4% difference in a typical application). Detailed comparisons are possible when X and Y are correlated jointly Gaussian distributed because an analytic expression for I(X,Y) can be derived for that case. Based on these tests, three conclusions can be drawn. First, the algorithm constructed here has an advantage over the Fraser-Swinney algorithm in providing an explicit calculation of the probability of the null hypothesis that X and Y are independent. Second, the Fraser-Swinney algorithm is marginally the more accurate of the two algorithms when large data sets are used. With smaller data sets, however, the Fraser-Swinney algorithm reports structures that disappear when more data are available. Third, the algorithm constructed here requires about 0.5% of the computation time required by the Fraser-Swinney algorithm.
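The jointly Gaussian test case mentioned in the abstract can be illustrated with a small sketch. It is neither the adaptive-partition algorithm described here nor the Fraser-Swinney algorithm; it is only a plain plug-in estimate on a uniform partition of the XY plane, the kind of calculation the abstract warns can mislead. The analytic value it is compared against is the standard result for a bivariate Gaussian with correlation coefficient rho, I(X,Y) = -(1/2) log2(1 - rho^2) bits. All function names and the choice of 16 bins are assumptions made for illustration.

```python
import math
import random


def gaussian_mi_bits(rho):
    """Analytic I(X,Y) in bits for a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log2(1.0 - rho * rho)


def histogram_mi_bits(xs, ys, bins):
    """Plug-in estimate of I(X,Y) in bits on a uniform bins-by-bins partition.

    Illustrative only: a fixed uniform grid, not an adaptive partition.
    """
    n = len(xs)

    def bin_index(v, lo, hi):
        # Map v in [lo, hi] to 0..bins-1; the maximum lands in the last bin.
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    joint = {}          # counts per occupied (i, j) cell
    px = [0] * bins     # marginal counts along X
    py = [0] * bins     # marginal counts along Y
    for x, y in zip(xs, ys):
        i = bin_index(x, x_lo, x_hi)
        j = bin_index(y, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1

    mi = 0.0
    for (i, j), c in joint.items():
        # p_ij * log2(p_ij / (p_i * p_j)), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[i] * py[j]))
    return mi


def correlated_gaussian_sample(n, rho, seed=0):
    """Draw n pairs (x, y) of unit-variance Gaussians with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * z)
    return xs, ys


if __name__ == "__main__":
    rho = 0.8
    xs, ys = correlated_gaussian_sample(20000, rho)
    print("analytic :", gaussian_mi_bits(rho))
    print("estimated:", histogram_mi_bits(xs, ys, bins=16))
```

Rerunning the script with different values of `bins` or smaller sample sizes shows how strongly a uniform-partition estimate depends on those choices, which is the sensitivity that motivates a principled choice of partition.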

Similar resources

Hierarchical mutual information for the comparison of hierarchical community structures in complex networks

The quest for a quantitative characterization of community and modular structure of complex networks produced a variety of methods and algorithms to classify different networks. However, it is not clear if such methods provide consistent, robust, and meaningful results when considering hierarchies as a whole. Part of the problem is the lack of a similarity measure for the comparison of hierarch...

Water Flooding Performance Evaluation Using Percolation Theory

Water flooding is a well-known secondary mechanism for improving oil recovery. The conventional approach to evaluating the performance of a water flooding process (e.g. breakthrough and post-breakthrough behavior) is to establish a reliable geological reservoir model, upscale it, and then perform flow simulations. To evaluate the uncertainty in the breakthrough time or post-breakthrough behavior, thi...

Research of Blind Signals Separation with Genetic Algorithm and Particle Swarm Optimization Based on Mutual Information

Blind source separation technique separates mixed signals blindly without any information on the mixing system. In this paper, we have used two evolutionary algorithms, namely, genetic algorithm and particle swarm optimization for blind source separation. In these techniques a novel fitness function that is based on the mutual information and high order statistics is proposed. In order to evalu...

Rank based Least-squares Independent Component Analysis

In this paper, we propose a nonparametric rank-based alternative to the least-squares independent component analysis algorithm. The basic idea is to estimate the squared-loss mutual information, which is used as the objective function of the algorithm, based on its copula density version. Therefore, no marginal densities have to be estimated. We provide empirical evaluation of th...

Journal:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

Volume 71, issue 6 Pt 2

Pages: -

Publication date: 2005